Abstract



This paper studies the continuous-time distributed optimization of a sum of convex functions over
directed graphs. Contrary to what is known in the consensus literature, where the same dynamics
works for both undirected and directed scenarios, we show that the consensus-based dynamics that
solves the continuous-time distributed optimization problem for undirected graphs fails to converge
when transcribed to the directed setting. This study sets the basis for the design of an alternative
distributed dynamics which we show is guaranteed to converge, on any strongly connected
weight-balanced digraph, to the set of minimizers of a sum of convex differentiable functions with
globally Lipschitz gradients. Our technical approach combines notions of invariance and cocoercivity
with the positive definiteness properties of graph matrices to establish the results.

I. Introduction

Distributed optimization of a sum of convex functions has applications in a variety of scenarios, 
including sensor networks, source localization, and robust estimation, and has been intensively 
studied in recent years, see e.g. jT), [0, [0, 0), [0, fl6l, [fTTTl . Most of these works build on 
consensus-based dynamics 0, (S), flU, IfTOl to design discrete-time algorithms that find the 
solution of the optimization problem. Recent exceptions are the works [12], [13], which deal with
continuous-time strategies on undirected networks. This paper further contributes to this body of
work by studying continuous-time algorithms for distributed optimization in directed scenarios.

The unidirectional information flow among agents characteristic of directed networks often 
leads to significant technical challenges when establishing convergence and robustness properties 
of coordination algorithms. The results of this paper provide one more example in support of 
this assertion for the case of continuous-time consensus-based distributed optimization. This is 
somewhat surprising given that, for consensus, the same dynamics works for both undirected
connected graphs and strongly connected, weight-balanced directed graphs, see e.g., [Q, [0. 

The contributions of this paper are the following. We first show that the solutions of the 
optimization problem of a sum of locally Lipschitz convex functions over a directed graph 
(or digraph) correspond to the saddle points of an aggregate objective function that depends 
on the graph topology through its Laplacian. This function is convex in its first argument 
and linear in the second. Moreover, its gradient is distributed when the graph is undirected. 
Our second step is then to study the convergence properties of the saddle-point dynamics and 
establish its asymptotic correctness when the original functions are locally Lipschitz (i.e., not 
necessarily differentiable) and convex, extending the results available in the literature [13] for 
continuously differentiable, strictly convex functions. Next, we consider the optimization problem 
over digraphs. We first provide an example of a strongly connected, weight-balanced digraph 
where the distributed version of the saddle-point dynamics does not converge. This motivates 
us to introduce a generalization of the dynamics that incorporates a design parameter. We show 
that, when the original functions are differentiable and convex with globally Lipschitz gradients, 
the design parameter can be appropriately chosen so that the resulting dynamics asymptotically 
converge to the set of minimizers of the objective function on any strongly connected and weight- 
balanced digraph. Our technical approach combines notions and tools from set-valued stability 
analysis, algebraic graph theory, and convex analysis. 

II. Preliminaries 

We start with notational conventions. Let $\mathbb{R}$ and $\mathbb{R}_{\ge 0}$ denote the set of reals and nonnegative
reals, respectively. We let $\|\cdot\|$ denote the Euclidean norm on $\mathbb{R}^d$. We let $\mathbf{1}_d = (1, \dots, 1)^T$,
$\mathbf{0}_d = (0, \dots, 0)^T \in \mathbb{R}^d$, and $\mathsf{I}_d$ denote the identity matrix in $\mathbb{R}^{d \times d}$. For
$A \in \mathbb{R}^{d_1 \times d_2}$ and $B \in \mathbb{R}^{e_1 \times e_2}$, $A \otimes B$ is their Kronecker product. A function
$f : X_1 \times X_2 \to \mathbb{R}$, with $X_1 \subset \mathbb{R}^{d_1}$, $X_2 \subset \mathbb{R}^{d_2}$ closed and convex, is concave-convex if it
is concave in its first argument and convex in the second one. A saddle point $(x_1^*, x_2^*) \in X_1 \times X_2$
of $f$ satisfies $f(x_1, x_2^*) \le f(x_1^*, x_2^*) \le f(x_1^*, x_2)$ for all $x_1 \in X_1$ and $x_2 \in X_2$. A set-valued map
$f : \mathbb{R}^d \rightrightarrows \mathbb{R}^d$ takes elements of $\mathbb{R}^d$ to subsets of $\mathbb{R}^d$.

A. Graph theory 

We present basic notions from algebraic graph theory [0. A directed graph, or digraph, is a 
pair $\mathcal{G} = (V, E)$, where $V$ is the (finite) vertex set and $E \subseteq V \times V$ is the edge set. A digraph is
undirected if $(v, u) \in E$ anytime $(u, v) \in E$. We refer to an undirected digraph as a graph. A
path is an ordered sequence of vertices such that any pair of vertices appearing consecutively
is an edge. A digraph is strongly connected if there is a path between any pair of distinct
vertices. For a graph, this notion is referred to as connected. A weighted digraph is a triplet
$\mathcal{G} = (V, E, A)$, where $(V, E)$ is a digraph and $A \in \mathbb{R}_{\ge 0}^{n \times n}$ is the adjacency matrix, satisfying
$a_{ij} > 0$ if $(v_i, v_j) \in E$ and $a_{ij} = 0$, otherwise. The weighted out-degree and in-degree of $v_i$,
$i \in \{1, \dots, n\}$, are respectively, $d_w^{out}(v_i) = \sum_{j=1}^n a_{ij}$ and $d_w^{in}(v_i) = \sum_{j=1}^n a_{ji}$. The weighted
out-degree matrix $D_{out}$ is diagonal with $(D_{out})_{ii} = d_w^{out}(v_i)$, for $i \in \{1, \dots, n\}$. The Laplacian matrix is
$L = D_{out} - A$. Note that $L \mathbf{1}_n = \mathbf{0}_n$. If $\mathcal{G}$ is strongly connected, then zero is a simple eigenvalue of
$L$. $\mathcal{G}$ is undirected if $L = L^T$ and weight-balanced if $d_w^{out}(v) = d_w^{in}(v)$, for all $v \in V$. The following
three notions are equivalent: (i) $\mathcal{G}$ is weight-balanced, (ii) $\mathbf{1}_n^T L = \mathbf{0}_n^T$, and (iii) $L + L^T$ is positive
semidefinite, see e.g., flU Theorem 1.37]. If $\mathcal{G}$ is weight-balanced and strongly connected, then
zero is a simple eigenvalue of $L + L^T$. Any undirected graph is weight-balanced.
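
The graph-theoretic notions above can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper names `laplacian` and `is_weight_balanced` and the two small digraphs are illustrative choices that do not appear in the paper. It builds $L = D_{out} - A$ and tests the equivalence between weight-balancedness and positive semidefiniteness of $L + L^T$.

```python
import numpy as np

def laplacian(A):
    """Return L = D_out - A for a weighted adjacency matrix A."""
    A = np.asarray(A, dtype=float)
    return np.diag(A.sum(axis=1)) - A

def is_weight_balanced(A, tol=1e-12):
    """True if weighted out-degree equals weighted in-degree at every vertex."""
    A = np.asarray(A, dtype=float)
    return bool(np.allclose(A.sum(axis=1), A.sum(axis=0), atol=tol))

# Hypothetical example 1: directed ring on 4 vertices with unit weights
# (strongly connected and weight-balanced).
A_ring = np.roll(np.eye(4), 1, axis=1)
L_ring = laplacian(A_ring)
ring_sym_eigs = np.linalg.eigvalsh(L_ring + L_ring.T)

# Hypothetical example 2: two vertices with unequal weights
# (strongly connected but not weight-balanced).
A_skew = np.array([[0.0, 2.0], [1.0, 0.0]])
L_skew = laplacian(A_skew)
skew_sym_eigs = np.linalg.eigvalsh(L_skew + L_skew.T)
```

In this sketch the ring gives a positive semidefinite $L + L^T$ with a simple zero eigenvalue, while the unbalanced pair gives an indefinite $L + L^T$, in line with the equivalence stated above.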

B. Nonsmooth analysis 

We recall some notions from nonsmooth analysis [15]. A function $f : \mathbb{R}^d \to \mathbb{R}$ is locally
Lipschitz at $x \in \mathbb{R}^d$ if there exist a neighborhood $U$ of $x$ and $C_x \in \mathbb{R}_{\ge 0}$ such that
$|f(y) - f(z)| \le C_x \|y - z\|$, for $y, z \in U$. $f$ is locally Lipschitz on $\mathbb{R}^d$ if it is locally Lipschitz at $x$ for all
$x \in \mathbb{R}^d$ and globally Lipschitz on $\mathbb{R}^d$ if there exists $C \in \mathbb{R}_{\ge 0}$ such that, for all $y, z \in \mathbb{R}^d$,
$|f(y) - f(z)| \le C \|y - z\|$. Locally Lipschitz functions are differentiable almost everywhere. If
$\Omega_f$ denotes the set of points where $f$ fails to be differentiable, the generalized gradient of $f$ is

\[ \partial f(x) = \mathrm{co}\{ \lim_{k \to \infty} \nabla f(x_k) \mid x_k \to x, \; x_k \notin \Omega_f \cup S \}, \]

where $S$ is any set of measure zero and $\mathrm{co}$ denotes convex hull.

Lemma 2.1: (Continuity of the generalized gradient map): Let $f : \mathbb{R}^d \to \mathbb{R}$ be a locally
Lipschitz function at $x \in \mathbb{R}^d$. Then the set-valued map $\partial f : \mathbb{R}^d \rightrightarrows \mathbb{R}^d$ is upper semicontinuous
and locally bounded at $x \in \mathbb{R}^d$ and moreover, $\partial f(x)$ is nonempty, compact, and convex.

For $f : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$ and $z \in \mathbb{R}^d$, we let $\partial_x f(x, z)$ denote the generalized gradient of
$x \mapsto f(x, z)$. Similarly, for $x \in \mathbb{R}^d$, we let $\partial_z f(x, z)$ denote the generalized gradient of $z \mapsto f(x, z)$.
A critical point $x \in \mathbb{R}^d$ of $f$ satisfies $0 \in \partial f(x)$. A function $f : \mathbb{R}^d \to \mathbb{R}$ is regular at $x \in \mathbb{R}^d$ if
for all $v \in \mathbb{R}^d$ the right directional derivative of $f$, in the direction of $v$, exists at $x$ and coincides
with the generalized directional derivative of $f$ at $x$ in the direction of $v$, see [15] for definitions
of these notions. A convex and locally Lipschitz function at $x$ is regular [15, Proposition 2.3.6].

Lemma 2.2: (Finite sum of locally Lipschitz functions): Let $\{f^i\}_{i=1}^n$ be locally Lipschitz at
$x \in \mathbb{R}^d$. Then $\partial(\sum_{i=1}^n f^i)(x) \subseteq \sum_{i=1}^n \partial f^i(x)$, and equality holds if $f^i$ is regular for $i \in \{1, \dots, n\}$
(here, the summation of sets is the set of points of the form $\sum_{i=1}^n g^i$ with $g^i \in \partial f^i(x)$).

A locally Lipschitz and convex function $f$ satisfies, for all $x, x' \in \mathbb{R}^d$ and $\xi \in \partial f(x)$, the
first-order condition of convexity,

\[ f(x') - f(x) \ge \xi^T (x' - x). \tag{1} \]

The notion of cocoercivity [16] plays a key role in our technical approach later. For $\delta \in \mathbb{R}_{>0}$, a
locally Lipschitz function $f$ is $\delta$-cocoercive if, for all $x, x' \in \mathbb{R}^d$ and $g_x \in \partial f(x)$, $g_{x'} \in \partial f(x')$,

\[ (x - x')^T (g_x - g_{x'}) \ge \delta \, (g_x - g_{x'})^T (g_x - g_{x'}). \]

The next result [16, Lemma 6.7] characterizes cocoercive differentiable convex functions.

Proposition 2.3: (Characterization of cocoercivity): Let $f$ be a differentiable convex function.
Then, $\nabla f$ is globally Lipschitz with constant $K \in \mathbb{R}_{>0}$ iff $f$ is $\frac{1}{K}$-cocoercive.
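
As an illustration of Proposition 2.3, the following minimal sketch (assuming NumPy) samples the cocoercivity inequality for a convex quadratic $f(x) = \frac{1}{2} x^T P x$, whose gradient $P x$ is globally Lipschitz with constant $K = \lambda_{\max}(P)$. The function `cocoercivity_gap`, the matrix `P`, and the random samples are hypothetical choices made only for this example.

```python
import numpy as np

def cocoercivity_gap(grad, x, xp, delta):
    """Return (x - x')^T (g_x - g_x') - delta * ||g_x - g_x'||^2 (nonnegative if cocoercive)."""
    g, gp = grad(x), grad(xp)
    return float((x - xp) @ (g - gp) - delta * (g - gp) @ (g - gp))

rng = np.random.default_rng(0)
B = rng.standard_normal((3, 3))
P = B.T @ B                                   # symmetric positive semidefinite
eigvals, eigvecs = np.linalg.eigh(P)
K = float(eigvals.max())                      # Lipschitz constant of the gradient

def grad(x):
    """Gradient of f(x) = 0.5 * x^T P x."""
    return P @ x

# Sample the inequality with delta = 1/K at random pairs of points.
gaps = [
    cocoercivity_gap(grad, rng.standard_normal(3), rng.standard_normal(3), 1.0 / K)
    for _ in range(200)
]

# A larger constant, here 2/K, is violated along the top eigenvector of P.
v_top = eigvecs[:, -1]
gap_too_large_delta = cocoercivity_gap(grad, v_top, np.zeros(3), 2.0 / K)
```

For this quadratic the sampled gaps with $\delta = 1/K$ stay nonnegative up to rounding, whereas $\delta = 2/K$ fails along the eigenvector of the largest eigenvalue, which suggests that $1/K$ cannot be improved in general.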

C. Set-valued dynamical systems 

Here, we recall some background on set-valued dynamical systems following [17]. A continuous-time
set-valued dynamical system on $X \subset \mathbb{R}^d$ is a differential inclusion

\[ \dot{x}(t) \in \Psi(x(t)) \tag{2} \]

where $t \in \mathbb{R}_{\ge 0}$ and $\Psi : X \subset \mathbb{R}^d \rightrightarrows \mathbb{R}^d$ is a set-valued map. A solution to this dynamical system
is an absolutely continuous curve $x : [0, T] \to X$ which satisfies (2) almost everywhere. The set
of equilibria of (2) is denoted by $\mathrm{Eq}(\Psi) = \{x \in X \mid 0 \in \Psi(x)\}$.

Lemma 2.4: (Existence of solutions): For $\Psi : \mathbb{R}^d \rightrightarrows \mathbb{R}^d$ upper semicontinuous with nonempty,
compact, and convex values, there exists a solution to (2) from any initial condition.

The LaSalle Invariance Principle is helpful to establish the asymptotic convergence of systems
of the form (2). A set $W \subset X$ is weakly positively invariant under (2) if, for each $x \in W$, there
exists at least one solution of (2) starting from $x$ entirely contained in $W$. Similarly, $W$ is
strongly positively invariant under (2) if, for each $x \in W$, all solutions of (2) starting from $x$
are entirely contained in $W$. Finally, the set-valued Lie derivative of a differentiable function
$V : \mathbb{R}^d \to \mathbb{R}$ with respect to $\Psi$ at $x \in \mathbb{R}^d$ is $\mathcal{L}_\Psi V(x) = \{v^T \nabla V(x) \mid v \in \Psi(x)\}$.

Theorem 2.5: (Set-valued LaSalle Invariance Principle): Let $W \subset X$ be strongly positively
invariant under (2) and $V : X \to \mathbb{R}$ a continuously differentiable function. Suppose the evolutions
of (2) are bounded and $\max \mathcal{L}_\Psi V(x) \le 0$ or $\mathcal{L}_\Psi V(x) = \emptyset$, for all $x \in W$. Let
$S_{\Psi,V} = \{x \in X \mid 0 \in \mathcal{L}_\Psi V(x)\}$. Then any solution $x(t)$, $t \in \mathbb{R}_{\ge 0}$, starting in $W$ converges to the largest
weakly positively invariant set $M$ contained in $\bar{S}_{\Psi,V} \cap W$. When $M$ is a finite collection of
points, then the limit of each solution equals one of them.

December 24, 2012    DRAFT

III. Problem statement and equivalent formulations 

Consider a network composed of $n$ agents $v_1, \dots, v_n$ whose communication topology is
described by a strongly connected digraph $\mathcal{G}$. An edge $(v_i, v_j)$ represents the fact that $v_i$ can
receive information from $v_j$. For each $i \in \{1, \dots, n\}$, let $f^i : \mathbb{R}^d \to \mathbb{R}$ be locally Lipschitz and
convex, and only available to agent $v_i$. The network objective is to solve

\[ \text{minimize } f(x) = \sum_{i=1}^n f^i(x), \tag{3} \]

in a distributed way. Let $x^i \in \mathbb{R}^d$ denote the estimate of agent $v_i$ about the value of the solution
to (3) and let $\boldsymbol{x}^T = ((x^1)^T, \dots, (x^n)^T) \in \mathbb{R}^{nd}$. Next, we provide an alternative formulation of (3).

Lemma 3.1: Let $L \in \mathbb{R}^{n \times n}$ be the Laplacian of $\mathcal{G}$ and define $\mathbf{L} = L \otimes \mathsf{I}_d \in \mathbb{R}^{nd \times nd}$. The
problem (3) on $\mathbb{R}^d$ is equivalent to the following problem on $\mathbb{R}^{nd}$,

\[ \text{minimize } \tilde{f}(\boldsymbol{x}) = \sum_{i=1}^n f^i(x^i), \quad \text{subject to } \mathbf{L} \boldsymbol{x} = \mathbf{0}_{nd}. \tag{4} \]

Proof: The proof follows by noting that (i) $\tilde{f}(\mathbf{1}_n \otimes x) = f(x)$ for all $x \in \mathbb{R}^d$ and (ii) since
$\mathcal{G}$ is strongly connected, $\mathbf{L} \boldsymbol{x} = \mathbf{0}_{nd}$ if and only if $\boldsymbol{x} = \mathbf{1}_n \otimes x$, for some $x \in \mathbb{R}^d$. ■
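
The agreement constraint used in Lemma 3.1 can be illustrated with a short numerical sketch, assuming NumPy. The directed ring, the dimensions `n` and `d`, and the test vectors are hypothetical; the sketch only checks that $\mathbf{L} = L \otimes \mathsf{I}_d$ annihilates vectors of the form $\mathbf{1}_n \otimes x$ and that its null space has dimension $d$ for this strongly connected digraph.

```python
import numpy as np

n, d = 3, 2
A = np.roll(np.eye(n), 1, axis=1)            # hypothetical directed ring, unit weights
L = np.diag(A.sum(axis=1)) - A               # Laplacian L = D_out - A
L_big = np.kron(L, np.eye(d))                # L (x) I_d, acting on stacked estimates

x = np.array([0.7, -1.2])                    # a common estimate in R^d
x_agree = np.kron(np.ones(n), x)             # 1_n (x) x: all agents agree
residual_agree = L_big @ x_agree

x_disagree = x_agree.copy()
x_disagree[0] += 1.0                         # perturb one coordinate of agent 1
residual_disagree = L_big @ x_disagree

# Dimension of the null space of L (x) I_d.
null_dim = L_big.shape[0] - np.linalg.matrix_rank(L_big)
```

Here the agreement vector satisfies the constraint in (4), the perturbed vector does not, and the null space has dimension $d$, which matches item (ii) in the proof.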

The formulation (4) is appealing because it brings together the estimates of each agent about
the value of the solution to the original optimization problem. Note that $\tilde{f}$ is locally Lipschitz
and convex. Moreover, from Lemma 2.2, the elements of its generalized gradient are of the form
$\tilde{g}_{\boldsymbol{x}} = (g^1_{x^1}, \dots, g^n_{x^n}) \in \partial \tilde{f}(\boldsymbol{x})$, where $g^i_{x^i} \in \partial f^i(x^i)$, for $i \in \{1, \dots, n\}$. Since $\tilde{f}$ is convex and
the constraints in (4) are linear, the constrained optimization problem is feasible [18].

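The next result introduces a function which corresponds to the Lagrangian function associated
to the constrained optimization problem (4) plus an additional quadratic term that vanishes if
the agreement constraint is satisfied. Interestingly, the saddle points of this function correspond
to the solutions of the constrained optimization problem, as we show next.

Proposition 3.2: (Solutions of the distributed optimization problem as saddle points): Let $\mathcal{G}$ be
strongly connected and weight-balanced, and define $F : \mathbb{R}^{nd} \times \mathbb{R}^{nd} \to \mathbb{R}$ by

\[ F(\boldsymbol{x}, \boldsymbol{z}) = \tilde{f}(\boldsymbol{x}) + \boldsymbol{x}^T \mathbf{L} \boldsymbol{z} + \tfrac{1}{2} \boldsymbol{x}^T \mathbf{L} \boldsymbol{x}. \tag{5} \]

Then $F$ is locally Lipschitz and convex in its first argument and linear in its second, and

(i) if $(\boldsymbol{x}^*, \boldsymbol{z}^*)$ is a saddle point of $F$, then so is $(\boldsymbol{x}^*, \boldsymbol{z}^* + \mathbf{1}_n \otimes a)$, for any $a \in \mathbb{R}^d$.

(ii) if $(\boldsymbol{x}^*, \boldsymbol{z}^*)$ is a saddle point of $F$, then $\boldsymbol{x}^*$ is a solution of (4).

(iii) if $\boldsymbol{x}^*$ is a solution of (4), then there exists $\boldsymbol{z}^*$ with $\mathbf{L} \boldsymbol{z}^* \in -\partial \tilde{f}(\boldsymbol{x}^*)$ such that $(\boldsymbol{x}^*, \boldsymbol{z}^*)$ is
a saddle point of $F$.

Proof: First, note that for $\mathcal{G}$ weight-balanced, $\mathbf{L} + \mathbf{L}^T$ is positive semidefinite. Since the
sum of convex functions is convex, one deduces that $F$ is convex in its first argument. By
inspection, $F$ is linear in its second argument. The statement (i) is immediate. To show (ii),
using that $\mathcal{G}$ is strongly connected, one can see that the saddle points of $F$ are of the form
$(\boldsymbol{x}^*, \boldsymbol{z}^*)$ with $\boldsymbol{x}^* = \mathbf{1}_n \otimes x^*$, $x^* \in \mathbb{R}^d$, and $\mathbf{L} \boldsymbol{z}^* \in -\partial \tilde{f}(\boldsymbol{x}^*)$. The last inclusion implies that there
exist $g^i_{x^*} \in \partial f^i(x^*)$, $i \in \{1, \dots, n\}$, such that $\mathbf{L} \boldsymbol{z}^* = -(g^1_{x^*}, \dots, g^n_{x^*})^T$. Noting that

\[ (\mathbf{1}_n^T \otimes \mathsf{I}_d) \mathbf{L} = (\mathbf{1}_n^T \otimes \mathsf{I}_d)(L \otimes \mathsf{I}_d) = \mathbf{1}_n^T L \otimes \mathsf{I}_d = \mathbf{0}_{d \times dn}, \]

we deduce $\mathbf{0}_d = (\mathbf{1}_n^T \otimes \mathsf{I}_d) \mathbf{L} \boldsymbol{z}^* = -\sum_{i=1}^n g^i_{x^*}$. As a result, using Lemma 2.2, $x^*$ is a solution
of (3), and hence $\boldsymbol{x}^* = \mathbf{1}_n \otimes x^*$ is a solution of (4). Finally, (iii) follows by noting $\boldsymbol{x}^* = \mathbf{1}_n \otimes x^*$ and the
fact that $0 \in \partial f(x^*)$ implies that there exists $\boldsymbol{z}^* \in \mathbb{R}^{nd}$ with $\mathbf{L} \boldsymbol{z}^* \in -\partial \tilde{f}(\boldsymbol{x}^*)$, yielding that
$(\boldsymbol{x}^*, \boldsymbol{z}^*)$ is a saddle point of $F$. ■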

IV. Continuous-time distributed optimization on undirected networks 

Here, we review the continuous-time solution to the optimization problem proposed in [12],
[13] for undirected graphs. If $\mathcal{G}$ is undirected, the gradient of $F$ in (5) is distributed over $\mathcal{G}$.
Given Proposition 3.2, it is natural to consider the saddle-point dynamics of $F$ to solve (3),

\[ \dot{\boldsymbol{x}} + \mathbf{L} \boldsymbol{x} + \mathbf{L} \boldsymbol{z} \in -\partial \tilde{f}(\boldsymbol{x}), \tag{6a} \]
\[ \dot{\boldsymbol{z}} = \mathbf{L} \boldsymbol{x}. \tag{6b} \]

Note that (6) is a set-valued dynamical system. Using Lemmas 2.1 and 2.4, one can guarantee
the existence of solutions. Moreover, from Proposition 3.2, if $(\boldsymbol{x}^*, \boldsymbol{z}^*)$ is an equilibrium of (6),
then $\boldsymbol{x}^*$ is a solution to (4). According to [13], the dynamics (6) leads the network to agree on
a global minimum of $f$ for the case when $\mathcal{G}$ is undirected and $f$ is both strictly convex and
the sum of differentiable convex functions. We extend here this result to the case when $\mathcal{G}$ is
undirected and $f$ is the sum of locally Lipschitz convex functions. The proof is also useful later
to illustrate the challenges in solving the distributed optimization problem over directed graphs.

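Theorem 4.1: (Asymptotic convergence of (6) on graphs): Let $\mathcal{G}$ be a connected graph and
consider the optimization problem (3), where each $f^i$, $i \in \{1, \dots, n\}$, is locally Lipschitz and
convex. Then, the projection onto the first component of any trajectory of (6) asymptotically
converges to the set of solutions to (4). Moreover, if $\tilde{f}$ has a finite number of critical points, the
limit of the projection onto the first component of each trajectory is a solution of (4).

Proof: For convenience, we denote the dynamics (6) by $\Psi_{dis\text{-}opt} : \mathbb{R}^{nd} \times \mathbb{R}^{nd} \rightrightarrows \mathbb{R}^{nd} \times \mathbb{R}^{nd}$.
Let $\boldsymbol{x}^* = \mathbf{1}_n \otimes x^*$ be a solution of (4). By Proposition 3.2(iii), there exists $\boldsymbol{z}^*$ such that
$(\boldsymbol{x}^*, \boldsymbol{z}^*) \in \mathrm{Eq}(\Psi_{dis\text{-}opt})$. First, note that given any initial condition $(\boldsymbol{x}_0, \boldsymbol{z}_0) \in \mathbb{R}^{nd} \times \mathbb{R}^{nd}$, the set

\[ W_{\boldsymbol{z}_0} = \{(\boldsymbol{x}, \boldsymbol{z}) \mid (\mathbf{1}_n^T \otimes \mathsf{I}_d) \boldsymbol{z} = (\mathbf{1}_n^T \otimes \mathsf{I}_d) \boldsymbol{z}_0\} \tag{7} \]

is strongly positively invariant under (6). Consider then the function $V : \mathbb{R}^{nd} \times \mathbb{R}^{nd} \to \mathbb{R}_{\ge 0}$,

\[ V(\boldsymbol{x}, \boldsymbol{z}) = \tfrac{1}{2} (\boldsymbol{x} - \boldsymbol{x}^*)^T (\boldsymbol{x} - \boldsymbol{x}^*) + \tfrac{1}{2} (\boldsymbol{z} - \boldsymbol{z}^*)^T (\boldsymbol{z} - \boldsymbol{z}^*). \tag{8} \]

The function $V$ is smooth. Let us examine its set-valued Lie derivative. For each $\xi \in \mathcal{L}_{\Psi_{dis\text{-}opt}} V(\boldsymbol{x}, \boldsymbol{z})$,
there exists $v = (-\mathbf{L} \boldsymbol{x} - \mathbf{L} \boldsymbol{z} - g_{\boldsymbol{x}}, \mathbf{L} \boldsymbol{x}) \in \Psi_{dis\text{-}opt}(\boldsymbol{x}, \boldsymbol{z})$, with $g_{\boldsymbol{x}} \in \partial \tilde{f}(\boldsymbol{x})$, such that

\[ \xi = v^T \nabla V(\boldsymbol{x}, \boldsymbol{z}) = -(\boldsymbol{x} - \boldsymbol{x}^*)^T (\mathbf{L} \boldsymbol{x} + \mathbf{L} \boldsymbol{z} + g_{\boldsymbol{x}}) + (\boldsymbol{z} - \boldsymbol{z}^*)^T \mathbf{L} \boldsymbol{x}. \tag{9} \]

Since $F$ is convex in its first argument and $\mathbf{L} \boldsymbol{x} + \mathbf{L} \boldsymbol{z} + g_{\boldsymbol{x}} \in \partial_{\boldsymbol{x}} F(\boldsymbol{x}, \boldsymbol{z})$, using the first-order
condition of convexity (1), we deduce $(\boldsymbol{x}^* - \boldsymbol{x})^T (\mathbf{L} \boldsymbol{x} + \mathbf{L} \boldsymbol{z} + g_{\boldsymbol{x}}) \le F(\boldsymbol{x}^*, \boldsymbol{z}) - F(\boldsymbol{x}, \boldsymbol{z})$. On the
other hand, the linearity of $F$ in its second argument implies that $(\boldsymbol{z} - \boldsymbol{z}^*)^T \mathbf{L} \boldsymbol{x} = F(\boldsymbol{x}, \boldsymbol{z}) - F(\boldsymbol{x}, \boldsymbol{z}^*)$.
Therefore, $\xi \le F(\boldsymbol{x}^*, \boldsymbol{z}) - F(\boldsymbol{x}^*, \boldsymbol{z}^*) + F(\boldsymbol{x}^*, \boldsymbol{z}^*) - F(\boldsymbol{x}, \boldsymbol{z}^*)$. Since the equilibria
of $\Psi_{dis\text{-}opt}$ are the saddle points of $F$, we deduce that $\xi \le 0$. Since $\xi$ is arbitrary, we conclude
$\max \mathcal{L}_{\Psi_{dis\text{-}opt}} V(\boldsymbol{x}, \boldsymbol{z}) \le 0$. As a by-product, the trajectories of (6) are bounded. Consequently,
all assumptions of the set-valued version of the LaSalle Invariance Principle, cf. Theorem 2.5,
are satisfied. This result then implies that any trajectory of (6) starting from an initial condition
$(\boldsymbol{x}_0, \boldsymbol{z}_0)$ converges to the largest weakly positively invariant set $M$ in $\bar{S}_{\Psi_{dis\text{-}opt}, V} \cap W_{\boldsymbol{z}_0}$. Our final
step consists of characterizing $M$. Let $(\boldsymbol{x}, \boldsymbol{z}) \in M$. Then $F(\boldsymbol{x}^*, \boldsymbol{z}^*) - F(\boldsymbol{x}, \boldsymbol{z}^*) = 0$, i.e.,

\[ \tilde{f}(\boldsymbol{x}^*) - \tilde{f}(\boldsymbol{x}) - (\boldsymbol{z}^*)^T \mathbf{L} \boldsymbol{x} - \tfrac{1}{2} \boldsymbol{x}^T \mathbf{L} \boldsymbol{x} = 0. \tag{10} \]

Define now $G : \mathbb{R}^{nd} \times \mathbb{R}^{nd} \to \mathbb{R}$ by $G(\boldsymbol{x}, \boldsymbol{z}) = \tilde{f}(\boldsymbol{x}) + \boldsymbol{z}^T \mathbf{L} \boldsymbol{x}$. Note that $G$ is convex in
its first argument and linear in its second, and that it has the same saddle points as $F$. As a
result, $G(\boldsymbol{x}^*, \boldsymbol{z}^*) - G(\boldsymbol{x}, \boldsymbol{z}^*) \le 0$, or equivalently, $\tilde{f}(\boldsymbol{x}^*) - \tilde{f}(\boldsymbol{x}) - (\boldsymbol{z}^*)^T \mathbf{L} \boldsymbol{x} \le 0$. Combining
this with (10), we have $\mathbf{L} \boldsymbol{x} = \mathbf{0}_{nd}$ and $-\tilde{f}(\boldsymbol{x}) + \tilde{f}(\boldsymbol{x}^*) = 0$, i.e., $\boldsymbol{x}$ is a solution to (4). Since $M$
is weakly positively invariant, there exists at least one solution of (6) starting from $(\boldsymbol{x}, \boldsymbol{z})$ that
remains in $M$. This implies that, along the solution, the components of $\boldsymbol{x}$ remain in agreement,
i.e., $\boldsymbol{x}(t) = \mathbf{1}_n \otimes a(t)$ with $a(t) \in \mathbb{R}^d$ a solution of (3). Applying $\mathbf{1}_n^T \otimes \mathsf{I}_d$ on both sides of
$\mathbf{1}_n \otimes \dot{a}(t) + \mathbf{L} \boldsymbol{z} \in -\partial \tilde{f}(\boldsymbol{x}(t))$, we deduce $n \dot{a}(t) \in -\sum_{i=1}^n \partial f^i(a(t))$. Lemma A.2 then implies
that $\dot{a}(t) = 0$, i.e., $\mathbf{L} \boldsymbol{z} \in -\partial \tilde{f}(\boldsymbol{x})$ and thus $(\boldsymbol{x}, \boldsymbol{z}) \in \mathrm{Eq}(\Psi_{dis\text{-}opt})$. Finally, if the set of equilibria
is finite, the last statement holds true. ■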
Remark 4.2: (Asymptotic convergence of saddle-point dynamics): The work [20] studies
saddle-point dynamics and guarantees asymptotic convergence to a saddle point when the function's
Hessian in one argument is positive definite and the function is linear in the other. Such a result,
however, cannot be applied to establish Theorem 4.1 because the generality of the hypotheses
on $f$ means that $F$ might not satisfy these conditions. Instead, our proof shows that a careful
study of the invariance properties of the flow yields the desired result. •

V. Continuous-time distributed optimization on directed networks 

Here, we consider the optimization problem (3) on digraphs. When $\mathcal{G}$ is directed, the gradient
of $F$ defined in (5) is no longer distributed over $\mathcal{G}$ because it contains terms that involve $\mathbf{L}^T$ and
hence requires agents to receive information from their in-neighbors. In fact, the dynamics (6),
which is distributed over $\mathcal{G}$, no longer corresponds to the saddle-point dynamics of $F$.
Nevertheless, it is natural to study whether (6) enjoys the same convergence properties as in the
undirected setting (as, for instance, is the case in the agreement problem [7], flU). Surprisingly,
this turns out not to be the case, as shown in Section V-A. This result motivates the introduction
in Section V-B of an alternative provably correct dynamics on weight-balanced directed graphs.

A. Counterexample 

Here, we provide an example of a strongly connected, weight-balanced digraph on which (6)
fails to converge. For convenience, we let $S_{agree} = \{(\mathbf{1}_n \otimes x, \mathbf{1}_n \otimes z) \in \mathbb{R}^{nd} \times \mathbb{R}^{nd} \mid x, z \in \mathbb{R}^d\}$
denote the set of agreement configurations. Our construction relies on the following result.

Lemma 5.1: (Necessary condition for the convergence of (6) on digraphs): Let $\mathcal{G}$ be a strongly
connected digraph and $f^i = 0$, $i \in \{1, \dots, n\}$. Then $S_{agree}$ is stable under (6) iff, for any nonzero
eigenvalue $\lambda$ of the Laplacian $L$, one has $\sqrt{3} \, |\mathrm{Im}(\lambda)| \le \mathrm{Re}(\lambda)$.

Proof: By assumption, the dynamics (6) is linear with matrix $\begin{pmatrix} -1 & -1 \\ 1 & 0 \end{pmatrix} \otimes \mathbf{L}$ and has $S_{agree}$
as equilibria. The eigenvalues of the matrix are of the form $\lambda \left( -\tfrac{1}{2} \pm \tfrac{\sqrt{3}}{2} i \right)$, with $\lambda$ an eigenvalue
of $\mathbf{L}$ (because the eigenvalues of a Kronecker product are just the product of the eigenvalues
of the corresponding matrices). Since $\mathbf{L} = L \otimes \mathsf{I}_d$, each eigenvalue of $\mathbf{L}$ is an eigenvalue of $L$.
Finally, $\mathrm{Re}\left( \lambda \left( -\tfrac{1}{2} \pm \tfrac{\sqrt{3}}{2} i \right) \right) = \tfrac{1}{2} \left( \mp \sqrt{3} \, \mathrm{Im}(\lambda) - \mathrm{Re}(\lambda) \right)$, from which the result follows. ■
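
The eigenvalue criterion of Lemma 5.1 is easy to test numerically. The sketch below assumes NumPy and takes $d = 1$; the function names and the two test graphs (a directed ring on four vertices and an undirected path) are hypothetical and are not the digraph of the paper's example. It evaluates the criterion and, as a cross-check, the largest real part of the spectrum of the linear dynamics (6) with $f^i = 0$.

```python
import numpy as np

def satisfies_criterion(L, tol=1e-9):
    """Check sqrt(3)|Im(lam)| <= Re(lam) for every nonzero eigenvalue lam of L."""
    eigs = np.linalg.eigvals(np.asarray(L, dtype=float))
    nonzero = eigs[np.abs(eigs) > tol]
    return bool(np.all(np.sqrt(3.0) * np.abs(nonzero.imag) <= nonzero.real + tol))

def saddle_flow_matrix(L):
    """Matrix of x' = -L x - L z, z' = L x, i.e. dynamics (6) with f^i = 0 and d = 1."""
    return np.kron(np.array([[-1.0, -1.0], [1.0, 0.0]]), L)

def laplacian(A):
    return np.diag(A.sum(axis=1)) - A

# Hypothetical weight-balanced digraph: directed ring on 4 vertices.
L_ring = laplacian(np.roll(np.eye(4), 1, axis=1))
# Hypothetical undirected graph: path on 3 vertices.
L_path = laplacian(np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]))

ring_ok = satisfies_criterion(L_ring)
path_ok = satisfies_criterion(L_path)
ring_max_real = float(np.linalg.eigvals(saddle_flow_matrix(L_ring)).real.max())
path_max_real = float(np.linalg.eigvals(saddle_flow_matrix(L_path)).real.max())
```

For the directed ring, the Laplacian has eigenvalues $1 \pm i$, which violate the criterion, and the flow matrix has an eigenvalue with positive real part. For the undirected path all Laplacian eigenvalues are real, so the criterion holds trivially.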
It is not difficult to construct examples of convex functions that have zero contribution to the
linearization of (6) around the solution. Therefore, such systems cannot be convergent if they
fail the criterion identified in Lemma 5.1. The next example shows that this criterion can fail
even for strongly connected weight-balanced digraphs.

Example 5.2: Consider the strongly connected, weight-balanced digraph with

as adjacency matrix. Note that $\lambda = 0.8833 \pm 0.5197 i$ is an eigenvalue of the Laplacian. Since
$\sqrt{3} \, |\mathrm{Im}(\lambda)| - \mathrm{Re}(\lambda) = 0.0171 > 0$, Lemma 5.1 implies that (6) fails to converge. •

B. Provably correct distributed dynamics on directed graphs 

Here, given the result in Section V-A, we introduce an alternative continuous-time distributed
dynamics for strongly connected weight-balanced digraphs. For reasons that will be made clear
later in Remark 5.5, we restrict our attention to the case when the functions $f^i$, $i \in \{1, \dots, n\}$,
are continuously differentiable. Let $a \in \mathbb{R}_{>0}$ and consider the dynamics

\[ \dot{\boldsymbol{x}} + a \mathbf{L} \boldsymbol{x} + \mathbf{L} \boldsymbol{z} = -\nabla \tilde{f}(\boldsymbol{x}), \tag{11a} \]
\[ \dot{\boldsymbol{z}} = \mathbf{L} \boldsymbol{x}. \tag{11b} \]

The existence of solutions is guaranteed by Lemmas 2.1 and 2.4. We first show that appropriate
choices of $a$ allow us to circumvent the problem raised in Lemma 5.1.

Lemma 5.3: (Sufficient conditions for the convergence of (11) on digraphs with trivial objective
function): Let $\mathcal{G}$ be a strongly connected and weight-balanced digraph and $f^i = 0$, $i \in \{1, \dots, n\}$.
If $a \ge 2\sqrt{2}$, then $S_{agree}$ is asymptotically stable under (11).

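Proof: When all $f^i$, $i \in \{1, \dots, n\}$, are identically zero, the dynamics (11) is linear and has
$S_{agree}$ as equilibria. Consider the coordinate transformation from $(\boldsymbol{x}, \boldsymbol{z})$ to $(\boldsymbol{x}, \boldsymbol{y}) = (\boldsymbol{x}, \beta \boldsymbol{x} + \boldsymbol{z})$,
with $\beta \in \mathbb{R}_{>0}$ to be chosen later. The dynamics can be rewritten as

\[ \begin{pmatrix} \dot{\boldsymbol{x}} \\ \dot{\boldsymbol{y}} \end{pmatrix} = A \begin{pmatrix} \boldsymbol{x} \\ \boldsymbol{y} \end{pmatrix}, \quad \text{where } A = \begin{pmatrix} -(a - \beta) \mathbf{L} & -\mathbf{L} \\ (-\beta(a - \beta) + 1) \mathbf{L} & -\beta \mathbf{L} \end{pmatrix}. \tag{12} \]

Consider the candidate Lyapunov function $V(\boldsymbol{x}, \boldsymbol{y}) = \boldsymbol{x}^T \boldsymbol{x} + \boldsymbol{y}^T \boldsymbol{y}$. Its Lie derivative is the
quadratic form defined by the matrix

\[ Q = \mathsf{I}_{2nd} A + A^T \mathsf{I}_{2nd} = \begin{pmatrix} -(a - \beta)(\mathbf{L} + \mathbf{L}^T) & -\mathbf{L} + (-\beta(a - \beta) + 1) \mathbf{L}^T \\ (-\beta(a - \beta) + 1) \mathbf{L} - \mathbf{L}^T & -\beta (\mathbf{L} + \mathbf{L}^T) \end{pmatrix}. \]

Select $\beta$ now satisfying $\beta^2 - a \beta + 2 = 0$ (this equation has a real solution if $a \ge 2\sqrt{2}$). Then
$-\beta(a - \beta) + 1 = -1$ and

\[ Q = \begin{pmatrix} -\frac{\beta^2 + 2}{\beta} + \beta & -1 \\ -1 & -\beta \end{pmatrix} \otimes (\mathbf{L} + \mathbf{L}^T). \tag{13} \]

Each eigenvalue $\eta$ of $Q$ is of the form $\eta = \lambda \, \frac{-(\beta^2 + 2) \pm \sqrt{(\beta^2 + 2)^2 - 4 \beta^2}}{2 \beta}$, where $\lambda$ is an eigenvalue of
$\mathbf{L} + \mathbf{L}^T$. Since $\mathcal{G}$ is strongly connected and weight-balanced, $L + L^T$ is positive semidefinite with
a simple eigenvalue at zero (so that $\mathbf{L} + \mathbf{L}^T = (L + L^T) \otimes \mathsf{I}_d$ is positive semidefinite), and hence
$\eta \le 0$. By the LaSalle invariance principle, the solutions
of (11) from any initial condition $(\boldsymbol{x}_0, \boldsymbol{y}_0) \in \mathbb{R}^{nd} \times \mathbb{R}^{nd}$ asymptotically converge to the set
$S = \{(\boldsymbol{x}, \boldsymbol{y}) \mid Q (\boldsymbol{x}, \boldsymbol{y})^T = \mathbf{0}_{2nd}\} \cap W_{\boldsymbol{z}_0}$. To conclude the result, we need to show that $S \subset S_{agree}$.
This follows from noting that, for $\beta > 0$, $Q (\boldsymbol{x}, \boldsymbol{y})^T = \mathbf{0}_{2nd}$ implies that $(\mathbf{L} + \mathbf{L}^T) \boldsymbol{x} = \mathbf{0}_{nd}$ and
$(\mathbf{L} + \mathbf{L}^T) \boldsymbol{y} = \mathbf{0}_{nd}$, i.e., $(\boldsymbol{x}, \boldsymbol{y}) \in S_{agree}$. ■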
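
The structure of $Q$ in the proof above can be verified numerically. The following is a minimal sketch, assuming NumPy and $d = 1$; the directed ring and the choice $a = 3$ (which gives the root $\beta = 1$ of $\beta^2 - a\beta + 2 = 0$) are illustrative assumptions. It builds $A$ as in (12), forms $Q = A + A^T$, compares it with the Kronecker form in (13), and contrasts it with the matrix obtained for $a = 1$, $\beta = 0$, which corresponds to the original dynamics (6) in the coordinates $(\boldsymbol{x}, \boldsymbol{z})$.

```python
import numpy as np

def transformed_matrix(L, a, beta):
    """Matrix A of the (x, y) dynamics with y = beta*x + z and f^i = 0 (d = 1)."""
    c = -beta * (a - beta) + 1.0
    return np.block([[-(a - beta) * L, -L], [c * L, -beta * L]])

def lyapunov_matrix(L, a, beta):
    """Q = A + A^T, the matrix of the Lie derivative of V(x, y) = x^T x + y^T y."""
    A = transformed_matrix(L, a, beta)
    return A + A.T

# Hypothetical weight-balanced digraph: directed ring on 4 vertices.
A_ring = np.roll(np.eye(4), 1, axis=1)
L = np.diag(A_ring.sum(axis=1)) - A_ring

a = 3.0
beta = (a - np.sqrt(a * a - 8.0)) / 2.0       # root of beta^2 - a*beta + 2 = 0
Q = lyapunov_matrix(L, a, beta)

# Kronecker form of Q after substituting the root beta.
M = np.array([[-(beta**2 + 2.0) / beta + beta, -1.0], [-1.0, -beta]])
Q_kron = np.kron(M, L + L.T)

max_eig_designed = float(np.linalg.eigvalsh(Q).max())
max_eig_original = float(np.linalg.eigvalsh(lyapunov_matrix(L, 1.0, 0.0)).max())
```

With the designed parameter the largest eigenvalue of $Q$ is zero up to rounding, so the quadratic form is negative semidefinite. With $a = 1$ and $\beta = 0$ the matrix has a positive eigenvalue on this ring, so the identity quadratic form is no longer a valid Lyapunov function there.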

The reason behind the introduction of the parameter $a$ in (11) comes from the following
observation: if one tries to reproduce the proof of Theorem 4.1 for a digraph, one encounters
indefinite terms of the form $(\boldsymbol{x} - \boldsymbol{x}^*)^T (\mathbf{L} - \mathbf{L}^T)(\boldsymbol{z} - \boldsymbol{z}^*)$ in the Lie derivative of $V$, invalidating
it as a Lyapunov function. However, the proof of Lemma 5.3 shows that an appropriate choice
of $a$, together with a suitable change of coordinates, makes the quadratic form defined by the
identity matrix a valid Lyapunov function. We next build on these observations to establish our
main result: the dynamics (11) solves in a distributed way the optimization problem (3) on
strongly connected weight-balanced digraphs.

Theorem 5.4: (Asymptotic convergence of (11) on weight-balanced digraphs): Let $\mathcal{G}$ be a
strongly connected, weight-balanced digraph and consider the optimization problem (3), where
each $f^i$, $i \in \{1, \dots, n\}$, is convex and differentiable with globally Lipschitz continuous gradient.
Let $K \in \mathbb{R}_{>0}$ be the Lipschitz constant of $\nabla \tilde{f}$ and define $h : \mathbb{R}_{>0} \to \mathbb{R}$ by

\[ h(r) = \frac{1}{2} \Lambda_*(L + L^T) \left( -\frac{r^4 + 3 r^2 + 2}{r} + \sqrt{\left( \frac{r^4 + 3 r^2 + 2}{r} \right)^2 - 4} \right) + \frac{K r^2}{1 + r^2}, \tag{14} \]

where $\Lambda_*(\cdot)$ denotes the non-zero eigenvalue with smallest absolute value. Then, there exists
$\beta^* \in \mathbb{R}_{>0}$ with $h(\beta^*) = 0$ such that, for all $0 < \beta < \beta^*$, the projection onto the first component
of any trajectory of (11) with $a = \frac{\beta^2 + 2}{\beta}$ asymptotically converges to the set of solutions of (4).
Moreover, if $\tilde{f}$ has a finite number of critical points, the limit of the projection onto the first
component of each trajectory is a solution of (4).
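
To make the design condition concrete, the following sketch (assuming NumPy) evaluates the function $h$ in (14) and locates its first sign change on a grid, followed by bisection. The helper names, the grid parameters, and the example data (a directed ring on three vertices with $K = 1$) are hypothetical choices for illustration only.

```python
import numpy as np

def lambda_star(L, tol=1e-9):
    """Nonzero eigenvalue of L + L^T with smallest absolute value."""
    eigs = np.linalg.eigvalsh(L + L.T)
    nonzero = eigs[np.abs(eigs) > tol]
    return float(nonzero[np.argmin(np.abs(nonzero))])

def h(r, lam, K):
    """The function h in (14); works for scalars and NumPy arrays with r > 0."""
    s = (r**4 + 3.0 * r**2 + 2.0) / r
    # -s + sqrt(s^2 - 4) rewritten as -4 / (s + sqrt(s^2 - 4)) to avoid cancellation.
    return 0.5 * lam * (-4.0 / (s + np.sqrt(s * s - 4.0))) + K * r**2 / (1.0 + r**2)

def beta_star(lam, K, r_max=10.0, grid=20001, iters=80):
    """First zero of h on (0, r_max], found by a grid scan followed by bisection."""
    rs = np.linspace(1e-3, r_max, grid)
    positive = np.nonzero(h(rs, lam, K) > 0.0)[0]
    if positive.size == 0 or positive[0] == 0:
        raise ValueError("no sign change found on the grid")
    lo, hi = rs[positive[0] - 1], rs[positive[0]]
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if h(mid, lam, K) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical data: directed ring on 3 vertices, gradient Lipschitz constant K = 1.
A_ring = np.roll(np.eye(3), 1, axis=1)
L = np.diag(A_ring.sum(axis=1)) - A_ring
K = 1.0
lam = lambda_star(L)
b_star = beta_star(lam, K)

beta = 1.0                                   # a value below b_star for this data
a_design = (beta**2 + 2.0) / beta            # design parameter a = (beta^2 + 2) / beta
```

For this data $\Lambda_*(L + L^T) = 3$ and the first zero of $h$ lies slightly above $1$, so $\beta = 1$ is admissible and yields $a = 3$.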

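Proof: For convenience, we denote the dynamics (11) by $\Psi_{a\text{-}dis\text{-}opt} : \mathbb{R}^{nd} \times \mathbb{R}^{nd} \to \mathbb{R}^{nd} \times \mathbb{R}^{nd}$.
Note that the equilibria of $\Psi_{a\text{-}dis\text{-}opt}$ are precisely the set of saddle points of $F$ in (5). Let
$\boldsymbol{x}^* = \mathbf{1}_n \otimes x^*$ be a solution of (4). First, note that given any initial condition $(\boldsymbol{x}_0, \boldsymbol{z}_0) \in \mathbb{R}^{nd} \times \mathbb{R}^{nd}$,
the set $W_{\boldsymbol{z}_0}$ defined by (7) is invariant under the evolutions of (11). By Proposition 3.2(i) and (iii),
there exists $(\boldsymbol{x}^*, \boldsymbol{z}^*) \in \mathrm{Eq}(\Psi_{a\text{-}dis\text{-}opt}) \cap W_{\boldsymbol{z}_0}$. Consider the function $V : \mathbb{R}^{nd} \times \mathbb{R}^{nd} \to \mathbb{R}_{\ge 0}$,

\[ V(\boldsymbol{x}, \boldsymbol{z}) = \tfrac{1}{2} (\boldsymbol{x} - \boldsymbol{x}^*)^T (\boldsymbol{x} - \boldsymbol{x}^*) + \tfrac{1}{2} (\boldsymbol{y}_{(\boldsymbol{x}, \boldsymbol{z})} - \boldsymbol{y}_{(\boldsymbol{x}^*, \boldsymbol{z}^*)})^T (\boldsymbol{y}_{(\boldsymbol{x}, \boldsymbol{z})} - \boldsymbol{y}_{(\boldsymbol{x}^*, \boldsymbol{z}^*)}), \]

where $\boldsymbol{y}_{(\boldsymbol{x}, \boldsymbol{z})} = \beta \boldsymbol{x} + \boldsymbol{z}$ and $\beta \in \mathbb{R}_{>0}$ satisfies $\beta^2 - a \beta + 2 = 0$. This function is quadratic, hence
smooth. Next, we consider its Lie derivative along $\Psi_{a\text{-}dis\text{-}opt}$ on $W_{\boldsymbol{z}_0}$. For $(\boldsymbol{x}, \boldsymbol{z}) \in W_{\boldsymbol{z}_0}$, let

\[ \xi = \mathcal{L}_{\Psi_{a\text{-}dis\text{-}opt}} V(\boldsymbol{x}, \boldsymbol{z}) = (-a \mathbf{L} \boldsymbol{x} - \mathbf{L} \boldsymbol{z} - \nabla \tilde{f}(\boldsymbol{x}), \mathbf{L} \boldsymbol{x})^T \nabla V(\boldsymbol{x}, \boldsymbol{z}) \]
\[ = \tfrac{1}{2} \left( (\boldsymbol{x} - \boldsymbol{x}^*)^T, (\boldsymbol{y}_{(\boldsymbol{x}, \boldsymbol{z})} - \boldsymbol{y}_{(\boldsymbol{x}^*, \boldsymbol{z}^*)})^T \right) A \, (\boldsymbol{x}, \boldsymbol{y}_{(\boldsymbol{x}, \boldsymbol{z})})^T - (\boldsymbol{x} - \boldsymbol{x}^*)^T \nabla \tilde{f}(\boldsymbol{x}) \]
\[ \quad + \tfrac{1}{2} \left( \boldsymbol{x}^T, \boldsymbol{y}_{(\boldsymbol{x}, \boldsymbol{z})}^T \right) A^T (\boldsymbol{x} - \boldsymbol{x}^*, \boldsymbol{y}_{(\boldsymbol{x}, \boldsymbol{z})} - \boldsymbol{y}_{(\boldsymbol{x}^*, \boldsymbol{z}^*)})^T - \beta (\boldsymbol{y}_{(\boldsymbol{x}, \boldsymbol{z})} - \boldsymbol{y}_{(\boldsymbol{x}^*, \boldsymbol{z}^*)})^T \nabla \tilde{f}(\boldsymbol{x}), \]

where $A$ is given by (12). This equation can be written as

\[ \xi = \tfrac{1}{2} \left( (\boldsymbol{x} - \boldsymbol{x}^*)^T, (\boldsymbol{y}_{(\boldsymbol{x}, \boldsymbol{z})} - \boldsymbol{y}_{(\boldsymbol{x}^*, \boldsymbol{z}^*)})^T \right) Q \, (\boldsymbol{x} - \boldsymbol{x}^*, \boldsymbol{y}_{(\boldsymbol{x}, \boldsymbol{z})} - \boldsymbol{y}_{(\boldsymbol{x}^*, \boldsymbol{z}^*)})^T - (\boldsymbol{x} - \boldsymbol{x}^*)^T \nabla \tilde{f}(\boldsymbol{x}) \]
\[ \quad + \left( (\boldsymbol{x} - \boldsymbol{x}^*)^T, (\boldsymbol{y}_{(\boldsymbol{x}, \boldsymbol{z})} - \boldsymbol{y}_{(\boldsymbol{x}^*, \boldsymbol{z}^*)})^T \right) A \, (\boldsymbol{x}^*, \boldsymbol{y}_{(\boldsymbol{x}^*, \boldsymbol{z}^*)})^T - \beta (\boldsymbol{y}_{(\boldsymbol{x}, \boldsymbol{z})} - \boldsymbol{y}_{(\boldsymbol{x}^*, \boldsymbol{z}^*)})^T \nabla \tilde{f}(\boldsymbol{x}), \]

where $Q$ is given by (13). Note that $A (\boldsymbol{x}^*, \boldsymbol{y}_{(\boldsymbol{x}^*, \boldsymbol{z}^*)})^T = -(\mathbf{L} \boldsymbol{y}_{(\boldsymbol{x}^*, \boldsymbol{z}^*)}, \beta \mathbf{L} \boldsymbol{y}_{(\boldsymbol{x}^*, \boldsymbol{z}^*)})^T = (\nabla \tilde{f}(\boldsymbol{x}^*), \beta \nabla \tilde{f}(\boldsymbol{x}^*))^T$.
Thus, after substituting for $\boldsymbol{y}_{(\boldsymbol{x}, \boldsymbol{z})}$, we have

\[ \xi = \tfrac{1}{2} \left( (\boldsymbol{x} - \boldsymbol{x}^*)^T, (\boldsymbol{z} - \boldsymbol{z}^*)^T \right) \tilde{Q} \, (\boldsymbol{x} - \boldsymbol{x}^*, \boldsymbol{z} - \boldsymbol{z}^*)^T \]
\[ \quad - (1 + \beta^2)(\boldsymbol{x} - \boldsymbol{x}^*)^T (\nabla \tilde{f}(\boldsymbol{x}) - \nabla \tilde{f}(\boldsymbol{x}^*)) - \beta (\boldsymbol{z} - \boldsymbol{z}^*)^T (\nabla \tilde{f}(\boldsymbol{x}) - \nabla \tilde{f}(\boldsymbol{x}^*)), \tag{15} \]

where

\[ \tilde{Q} = \begin{pmatrix} -\frac{\beta^4 + 2 \beta^2 + 2}{\beta} & -(1 + \beta^2) \\ -(1 + \beta^2) & -\beta \end{pmatrix} \otimes (\mathbf{L} + \mathbf{L}^T). \tag{16} \]

Each eigenvalue of $\tilde{Q}$ is of the form

\[ \tilde{\eta} = \lambda \times \frac{-(\beta^4 + 3 \beta^2 + 2) \pm \sqrt{(\beta^4 + 3 \beta^2 + 2)^2 - 4 \beta^2}}{2 \beta}, \]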
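where $\lambda$ is an eigenvalue of $\mathbf{L} + \mathbf{L}^T$. Using the cocoercivity of $\tilde{f}$, we can upper bound $\xi$ as

\[ \xi \le \frac{1}{2} \begin{pmatrix} \boldsymbol{x} - \boldsymbol{x}^* \\ \boldsymbol{z} - \boldsymbol{z}^* \\ \nabla \tilde{f}(\boldsymbol{x}) - \nabla \tilde{f}(\boldsymbol{x}^*) \end{pmatrix}^T \underbrace{\begin{pmatrix} \tilde{Q}_{11} & \tilde{Q}_{12} & 0 \\ \tilde{Q}_{21} & \tilde{Q}_{22} & -\beta \mathsf{I}_{nd} \\ 0 & -\beta \mathsf{I}_{nd} & -\frac{1}{K}(1 + \beta^2) \mathsf{I}_{nd} \end{pmatrix}}_{\bar{Q}} \begin{pmatrix} \boldsymbol{x} - \boldsymbol{x}^* \\ \boldsymbol{z} - \boldsymbol{z}^* \\ \nabla \tilde{f}(\boldsymbol{x}) - \nabla \tilde{f}(\boldsymbol{x}^*) \end{pmatrix}, \tag{17} \]

where $K \in \mathbb{R}_{>0}$ is the Lipschitz constant for the gradient of $\tilde{f}$.

Since $(\boldsymbol{x}, \boldsymbol{z}) \in W_{\boldsymbol{z}_0}$, we have $(\mathbf{1}_n^T \otimes \mathsf{I}_d)(\boldsymbol{z} - \boldsymbol{z}^*) = \mathbf{0}_d$ and hence it is enough to establish
that $\bar{Q}$ is negative semidefinite on the subspace $\mathcal{W} = \{(v_1, v_2, v_3) \in (\mathbb{R}^{nd})^3 \mid (\mathbf{1}_n^T \otimes \mathsf{I}_d) v_2 = \mathbf{0}_d\}$.
Using the fact that $-\frac{1}{K}(1 + \beta^2) \mathsf{I}_{nd}$ is invertible, we can express $\bar{Q}$ as

\[ \bar{Q} = N \begin{pmatrix} Q' & 0 \\ 0 & -\frac{1}{K}(1 + \beta^2) \mathsf{I}_{nd} \end{pmatrix} N^T, \quad Q' = \tilde{Q} + \frac{K \beta^2}{1 + \beta^2} \begin{pmatrix} 0 & 0 \\ 0 & \mathsf{I}_{nd} \end{pmatrix}, \quad N = \begin{pmatrix} \mathsf{I}_{nd} & 0 & 0 \\ 0 & \mathsf{I}_{nd} & \frac{\beta K}{1 + \beta^2} \mathsf{I}_{nd} \\ 0 & 0 & \mathsf{I}_{nd} \end{pmatrix}. \]

Noting that $\mathcal{W}$ is invariant under $N^T$ (i.e., $N^T \mathcal{W} = \mathcal{W}$), all we need to check is that the matrix

\[ \begin{pmatrix} Q' & 0 \\ 0 & -\frac{1}{K}(1 + \beta^2) \mathsf{I}_{nd} \end{pmatrix} \]

is negative semidefinite on $\mathcal{W}$. Clearly, $-\frac{1}{K}(1 + \beta^2) \mathsf{I}_{nd}$ is negative definite.
On the other hand, on $(\mathbb{R}^{nd})^2$, $0$ is an eigenvalue of $\tilde{Q}$ with multiplicity $2d$ and eigenspace
generated by vectors of the form $(\mathbf{1}_n \otimes a, 0)$ and $(0, \mathbf{1}_n \otimes b)$, with $a, b \in \mathbb{R}^d$. However, on
$\{(v_1, v_2) \in (\mathbb{R}^{nd})^2 \mid (\mathbf{1}_n^T \otimes \mathsf{I}_d) v_2 = \mathbf{0}_d\}$, $0$ is an eigenvalue of $\tilde{Q}$ with multiplicity $d$ and
eigenspace generated by vectors of the form $(\mathbf{1}_n \otimes a, 0)$. Moreover, on
$\{(v_1, v_2) \in (\mathbb{R}^{nd})^2 \mid (\mathbf{1}_n^T \otimes \mathsf{I}_d) v_2 = \mathbf{0}_d\}$, the eigenvalues of
$\frac{K \beta^2}{1 + \beta^2} \begin{pmatrix} 0 & 0 \\ 0 & \mathsf{I}_{nd} \end{pmatrix}$ are $\frac{K \beta^2}{1 + \beta^2}$ with multiplicity $nd - d$ and $0$ with
multiplicity $nd$. Therefore, using Weyl's theorem [21, Theorem 4.3.7], we deduce that the nonzero
eigenvalues of the sum $Q'$ are upper bounded by $\Lambda_*(\tilde{Q}) + \frac{K \beta^2}{1 + \beta^2}$. From (16) and the definition
of $h$ in (14), we conclude that the nonzero eigenvalues of $Q'$ are upper bounded by $h(\beta)$. It
remains to show that there exists $\beta^* \in \mathbb{R}_{>0}$ with $h(\beta^*) = 0$ such that for all $0 < \beta < \beta^*$
we have $h(\beta) < 0$. For $r > 0$ small enough, $h(r) < 0$, since $h(r) = -\tfrac{1}{2} \Lambda_*(L + L^T) r + O(r^2)$.
Furthermore, $\lim_{r \to \infty} h(r) = K > 0$. Hence, since $h$ is continuous, the existence of $\beta^*$ follows from
the Intermediate Value Theorem. Therefore we conclude $\mathcal{L}_{\Psi_{a\text{-}dis\text{-}opt}} V(\boldsymbol{x}, \boldsymbol{z}) \le 0$. As a by-product, the trajectories
of (11) are bounded. Consequently, all assumptions of the LaSalle Invariance Principle are
satisfied and its application yields that any trajectory of (11) starting from an initial condition
$(\boldsymbol{x}_0, \boldsymbol{z}_0)$ converges to the largest positively invariant set $M$ in $\bar{S}_{\Psi_{a\text{-}dis\text{-}opt}, V} \cap W_{\boldsymbol{z}_0}$. Note that if
$(\boldsymbol{x}, \boldsymbol{z}) \in \bar{S}_{\Psi_{a\text{-}dis\text{-}opt}, V} \cap W_{\boldsymbol{z}_0}$, then

\[ N^T \begin{pmatrix} \boldsymbol{x} - \boldsymbol{x}^* \\ \boldsymbol{z} - \boldsymbol{z}^* \\ \nabla \tilde{f}(\boldsymbol{x}) - \nabla \tilde{f}(\boldsymbol{x}^*) \end{pmatrix} \in \ker(Q') \times \{0\}. \]

From the discussion above, we know $\ker(Q')$ is generated by vectors of the form $(\mathbf{1}_n \otimes a, 0)$, and hence this implies
that $\boldsymbol{x} = \boldsymbol{x}^* + \mathbf{1}_n \otimes a$, $\boldsymbol{z} = \boldsymbol{z}^*$, and $\nabla \tilde{f}(\boldsymbol{x}) = \nabla \tilde{f}(\boldsymbol{x}^*)$, from where we deduce that $\boldsymbol{x}$ is
also a solution to (4). Finally, for $(\boldsymbol{x}, \boldsymbol{z}) \in M$, an argument similar to the one in the proof of
Theorem 4.1 establishes $(\boldsymbol{x}, \boldsymbol{z}) \in \mathrm{Eq}(\Psi_{a\text{-}dis\text{-}opt})$. If the set of equilibria is finite, convergence to
a point is also guaranteed. ■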
Figure 1 illustrates the result of Theorem 5.4 for the network of Example 5.2.

Fig. 1. Execution of (11) for the network of Example 5.2 with $f^1(x) = e^x$, $f^2(x) = (x - 3)^2$, $f^3(x) = (x + 3)^2$, $f^4(x) = x^4$,
$f^5(x) = 4$. (a) and (b) show the evolution of the agents' values in $\boldsymbol{x}$ and $\boldsymbol{z}$, respectively, and (c) shows the value of the
Lyapunov function. Here, $a = 3$, $\boldsymbol{x}_0 = (1, 2, 0.3, 1, 1)^T$, and $\boldsymbol{z}_0 = \mathbf{1}_5$. The equilibrium $(\boldsymbol{x}^*, \boldsymbol{z}^*)$ is $\boldsymbol{x}^* = -0.2005 \cdot \mathbf{1}_5$ and
$\boldsymbol{z}^* = (1.1784, 4.3717, -4.1598, 2.2598, 1.3499)^T$.
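
A behavior of this kind can be reproduced with a simple forward-Euler integration of (11). The following is a minimal sketch, assuming NumPy and $d = 1$. It does not use the network or the objective functions of the figure; instead it uses a hypothetical directed ring on three vertices and quadratic costs $f^i(x) = \frac{1}{2}(x - c_i)^2$, for which $K = 1$ and the minimizer of the sum is the mean of the $c_i$. The step size and horizon are also illustrative choices.

```python
import numpy as np

def simulate(L, grad, a, x0, z0, dt=0.01, steps=5000):
    """Forward-Euler integration of x' = -a L x - L z - grad(x), z' = L x (d = 1)."""
    x = np.array(x0, dtype=float)
    z = np.array(z0, dtype=float)
    for _ in range(steps):
        dx = -a * (L @ x) - L @ z - grad(x)
        dz = L @ x
        x, z = x + dt * dx, z + dt * dz
    return x, z

# Hypothetical weight-balanced digraph: directed ring on 3 vertices.
n = 3
A_ring = np.roll(np.eye(n), 1, axis=1)
L = np.diag(A_ring.sum(axis=1)) - A_ring

# Hypothetical local costs f^i(x) = 0.5 * (x - c_i)^2, stacked gradient x - c.
c = np.array([1.0, -2.0, 4.0])

def grad(x):
    return x - c

# a = 3 corresponds to beta = 1 in a = (beta^2 + 2) / beta.
x_final, z_final = simulate(L, grad, a=3.0, x0=np.zeros(n), z0=np.zeros(n))
equilibrium_residual = L @ z_final + grad(x_final)
```

In this sketch every agent's estimate approaches the mean of the $c_i$, the sum of the entries of $\boldsymbol{z}$ stays at its initial value (consistent with the invariance of $W_{\boldsymbol{z}_0}$), and $\mathbf{L} \boldsymbol{z} + \nabla \tilde{f}(\boldsymbol{x})$ vanishes at the limit.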

Remark 5.5 (Locally Lipschitz objective functions): Our simulations suggest that the convergence
result in Theorem 5.4 holds true for any locally Lipschitz objective function. However, our
proof cannot be reproduced for this case because it would rely on the generalized gradient being
globally Lipschitz which, by Proposition A.1, would imply that the function is differentiable. •

Remark 5.6 (Selection of $a$ in (11)): According to Theorem 5.4, the parameter $a$ is determined
by $\beta$ as $a = \frac{\beta^2 + 2}{\beta}$. In turn, one can observe from (14) that the range of suitable values
for $\beta$ increases with higher network connectivity and smaller variability of the gradient of the
objective function. From a control design viewpoint, it is reasonable to choose the value of $\beta$
that yields the smallest $a$ while satisfying the conditions of the theorem statement. •

Remark 5.7 (Discrete-time counterpart of (6) and (11)): It is worth noticing that the discretization
of (6) for undirected graphs (performed in [13] for the case of continuously differentiable,
strictly convex functions) and (11) for weight-balanced digraphs gives rise to different discrete-time
optimization algorithms from the ones considered in [1], [2], [3], [4], [5], [6]. •

VI. Conclusions and future work 

We have studied the distributed optimization of a sum of convex functions over directed 
networks using consensus-based dynamics. Somewhat surprisingly, we have established that the 
convergence results established in the literature for undirected networks do not carry over to the 
directed scenario. Nevertheless, our analysis has allowed us to introduce a slight generalization
of the saddle-point dynamics of the undirected case which incorporates a design parameter. 
We have proved that, for appropriate parameter choices, this dynamics solves the distributed 
optimization problem for differentiable convex functions with globally Lipschitz gradients on 
strongly connected and weight-balanced digraphs. Our technical approach relies on a careful 
combination of notions from stability analysis, algebraic graph theory, and convex analysis. 
Future work will focus on the extension of the convergence results to locally Lipschitz functions 
in the weight-balanced directed case and to general digraphs, the incorporation of local and 
global constraints, the design of distributed algorithms that allow the network to agree on an 
optimal value of the design parameter, the discretization of the algorithms, and the study of the 
potential connections with dynamic consensus strategies. 
Appendix

The next result shows that the differentiability hypothesis of Proposition 2.3 cannot be relaxed.

Proposition A.1 (Lipschitz generalized gradient and differentiability): Any locally Lipschitz
function with globally Lipschitz generalized gradient is differentiable.

Proof: Let $f : \mathbb{R}^d \to \mathbb{R}$ be a locally Lipschitz function with a globally Lipschitz
generalized gradient map [flD . Take $x \in \mathbb{R}^d$ and let us show that $\partial f(x)$ is a singleton. Since $f$
is differentiable almost everywhere, there exists a sequence of points $\{x_n\}_{n=1}^\infty$, where $f$ is
differentiable, such that $\lim_{n \to \infty} x_n = x$. Using the set-valued Lipschitz property of $\partial f$, we
have $\partial f(x) \subset \nabla f(x_n) + K \|x_n - x\| B(0, 1)$, where $K \in \mathbb{R}_{>0}$ is the Lipschitz constant and
$B(0, 1)$ is the ball centered at $0 \in \mathbb{R}^d$ of radius one. Hence, any element $v \in \partial f(x)$ can be
written as $v = \nabla f(x_n) + K \|x_n - x\| u_n$, where $u_n \in \mathbb{R}^d$ satisfies $\|u_n\| \le 1$. Now, taking the
limit, $v = \lim_{n \to \infty} \nabla f(x_n)$. Hence the generalized gradient is singleton-valued. Differentiability
follows now from the set-valued Lipschitz condition. ■

Lemma A.2 (Generalized gradient flow from a critical point): Let $f : \mathbb{R}^d \to \mathbb{R}$ be locally
Lipschitz and convex, and let $x^*$ be a minimizer of $f$. Then, the only solution of $\dot{x}(t) \in -\partial f(x(t))$
starting from $x^*$ is $x(t) = x^*$, for all $t \ge 0$.

Proof: We reason by contradiction. Assume $x(t)$ is not identically $x^*$. Since $f$ is monotonically
nonincreasing along the gradient flow, the trajectory must stay in the set of minimizers of $f$,
and hence $t \mapsto f(x(t))$ is constant. Let $t'$ be the smallest time such that $-\partial f(x^*) \ni v = \dot{x}(t') \neq 0$.
Using [22, Lemma 1], we have $0 = \frac{d}{dt} f(x(t)) = v^T \xi$, for all $\xi \in \partial f(x^*)$. In particular, for
$\xi = -v$, we get $0 = -\|v\|_2^2$, which is a contradiction. ■